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ABSTRACT 

Context. Observations and theoretical calculations have shown the importance of non-spherically symmetric structures in supernovae. 
Thus, the interpretation of observed supernova spectra requires the ability to solve the transfer equation in 3-D moving atmospheres. 
Aims. We present an implementation of the solution of the radiative transfer equation in 3-D homologously expanding atmospheres 
in spherical coordinates. The implementation is exact to all orders in v/c. 

/Wefhods. We use the methods that we have developed in previous papers in this series as well as a new afiine method that makes use 
of the fact that photons travel on straight lines. The afiine method greatly facilitates delineating the characteristics and can be used in 
the case of strong-gravitational and arbitrary- velocity fields. 

Results. We compare our results in 3-D for spherically symmetric test problems with high velocity fields (up to 87% of the speed 
of light and find excellent agreement, when the number of momentum space angles is high. Our well-tested 1-D results are based 
on methods where the momentum directions vary along the characteristic (co-moving momentum directions). Thus, we are able to 
verify both the analytic framework and its numerical implementation. Additionally, we have been able to test the parallelization over 
characteristics. Using 512" momentum angles we ran the code on 16,384 Opteron processors and achieved excellent scaling. 
Conclusions. It is now possible to calculate synthetic spectra from realistic 3D hydro simulations. This should open an era of progress 
in hydro modeling, similar to that that occurred in the 1980s when 1-D models were confronted with synthetic spectra. 
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1. Introduction 

Supernovae of all types are known to deviate significantly from 
spherical symmetry. The evidence comes from both flux spec- 
tra, but particularly from the interpretation of spectropolarime- 
try (see [Wang & Wheeler||2008] and references therein). In the 
case of core-collapse supernovae, the asymmetry is thought to 
be due to the underlying central engine which is probably asym- 
metric and this leads to geometrically asymmetric ejecta, with 
the asymmetry growing as one gets closer to the central en- 
gine (thus "stripped" supernovae such as Type Ic are signifi- 
cantly more asymmetric than supernovae with intact hydrogen 
envelopes such as Type IIP). Type la (thermonuclear) super- 
novae are thought to be geometrically rather round but the com- 
position is thought to be asymmetrical. Since the light curve of 
Type la supernovae is powered by the radioactive decay of ^''Ni 
and its products, asymmetries in the ^''Ni distribution will lead 
to asymmetries in the ionization fractions and opacities that will 
produce polarization and alter the flux spectra. Thus, particularly 
in Type la supernovae one can accurately calculate light curves 
and spectra assuming homologous flow (v oc r) but including the 
geometrical or compositional asymmetry in three dimensions. 

In this series of papers ( Hauschildt & Baron|2006 Baron & 
|Hauschildt|[2007l [Hauschildt & Baron||2008| |2009| henceforth 
Papers I-IV) we have built up the full characteristics method of 
solving the transfer equation in 3-D for static atmospheres in a 
variety of geometries. Here we build on the results of Paper IV 



for spherical geometry as well as those of Chen et al. ( 2QQ7\ for 
the affine method. 



2. Transfer Equation 



|Chen et al. ( |2007[ ) showed that the transfer equation in flat space- 
time could be written in terms of an affine parameter and that the 
right hand side could be evaluated in the co-moving frame pro- 
vided that the wavelength (or frequency) was evaluated in the 
co-moving frame. However, the momentum directions could be 
held constant and coincide with those of the observer's frame. 

We define the rest frame photon direction in spherical coor- 
dinates as 



n = (l,0„,0„), |n| = 1, 

or in Cartesian coordinates 

n = (sin 6„ cos 0„, sin 0„ sin <p„,cos 9n). 

and the starting position of the photon 

ro = (ro, 6*0, 0o)- 

The 3-D geodesic can be parametrized as 
r(s) - ro + ns. 



(1) 



(2) 



(3) 



(4) 
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where s is the rest frame physical distance related to the affine 
parameter ^ by 



h 

/loo 

and is measured starting from Tq. This gives us 

dr _ , n ■ To + 5 n ■ r 

ds r r ' 



(5) 

(6) 
(7) 



Now we can define an effective optical depth, 

(a{s)Ai , 
XAfis) + Aa{s) + _ ^ \ds 

= Xds, 



(18) 
(19) 



which defines x- We can also define the traditional source func- 
tion S A - J]a/Xa, so that Eq. ( 17 i becomes 



dh J ^Xa L „ , , a(s) /tf-i/y, 
-1- ^ h + — \SAf(s) + — 



= h + S 



A, 



(20) 
(21) 



and 



r = |r| = |ro + ni| = ^rl + 2{n-ro)s + s^, 
where 

n ■ To = ro[sin0o sin0„ cos(0„ - (^o) + cos 00 cos 0„] 

= xop_,- + yoPy + ZoPz- 

Note also that 

dr dr 



(9) 



(10) 



From Eqs. Q above and (18) of |Chen et al] ( [20071 ), we find 



dh , dA dh , /loo ^ dA A^ 
OS ds oA A A ds A 



y(r)[l-r/3(r)]=f(s). 



with 

Aco 

T " 

From Eq. ( [T2| we find 

IdA _ (J3/r){l-r^)-y^f3'r(p-r) 

Ads ~ 1 - rp(r) 

= a(s). 

Now we have 

^^'^ U + a(s)A^ = -IXAfis) + 5a(s)]h + T]Af(s). 



(11) 



(12) 



ds 



dA 



(13) 



(14) 



Finally, this can be put into the standard form used in PHOENIX 
dHauschildt & Baron|2004b||T999] l 



dl d 

+ a(s)—(AlA) + 4-ais)lA = -XAf(s)h + mf(s), 
OS oA 



(15) 



where a(s) is given by Eq. ( 13 1, and f{s) is given by Eq. ( 12 1 



In order to finite difference this equation we need to ex 



pUcitly difference the ^(AIa) term. As described in 
( |2007| l we can write 



Chen et al. 



OA 



Al - 



which defines S a- 

The more sophisticated discretization in A described in 

can also be 



(8) Hauschildt & Baron (2004a) or Knop et al. (2009 



implemented. For the case of arbitrary velocity fields the method 
of Knop et al. ( 2009)) will be required. Nevertheless, this straight- 
forward method yields excellent results when compared with the 
more sophisticated treatment used in the 1-D code (see below). 



3. Angular Integration 

To solve the scattering problem in the co-moving frame, we 
need to calculate the mean intensity and A* in the co-moving 
frame. Recall that the specific intensity is calculated in a frame 
where five of the six phase-space variables are actually ob- 
server's frame quantities. In particular, the two momentum di- 
rections are fixed observer's frame quantities. Thus, we need to 
perform the angular integration in the co-moving frame. We have 



u = r[i>y8]> MO = [1,0,0,0], 

and 

P=-[l,n]. 
Then 



d^J\''-^\ dno 



(22) 



(23) 



Now 



u ■ p 



he 



he 

u- p^'^jV^-p-^], uo-p^—. 

A A 

Thus 

dO. = (7[1 -j3-n]r^dno 
= (r[l -/3(r)r])-2flfQ„ 

= fisy^dno. 



3.0.1 . Computation of A* 



(24) 



(25) 



(16) 



Then, Eq. ( 15 ) can be written as: 



dh 
ds 



a(s) 



^Aa{s)+ XAfis) 



a(s)- — + r\xf{s). 

Al - /t/_i 



(17) 



The computation of A* proceeds following the same procedure 
as in Paper I. As demonstrated by ( [Olson et al.|1986 ) and ( Olson 
|& Kunasz|1987| ), the coefficients a, j3, and y can be used to con- 
struct diagonal and tri-diagonal A* operators for ID radiation 
transport problems. In fact, up to the full A matrix can be con- 
structed by a straightforward extension of the idea ( [Hauschildt 
[eTall [199^1 [Hauschildt & Baron|[200 4?). These non-local A* 
operators not only lead to excellent convergence rates but they 
avoid the problem of false convergence that is inherent in the A 
iteration method and can also be an issue for diagonal (purely 
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local) A* operators. Therefore, it is highly desirable to imple- 
ment a non-local A* for the 3D case. The tri-diagonal operator 
in the ID case is simply a nearest neighbor A* that considers 
the interaction of a point with its two direct neighbors. In the 3D 
case, the nearest neighbor A* considers the interaction of a voxel 
with the (up to) 3^ - 1 = 26 surrounding voxels (this definition 
considers a somewhat larger range of voxels than a strictly face- 
centered view of just 6 nearest neighbors). This means that the 
non-local A* requires the storage of 27 (26 surrounding voxels 
plus local, i.e., diagonal effects) times the total number of voxels 
A* elements. 

The construction of the A* operator proceeds in the same 
way as discussed in Hauschildt ( ,1992) and Paper I. In the 3D 
case, the 'previous' and 'next' voxels along each characteristic 
must be known so that the contributions can be attributed to the 
correct voxel. Therefore, we use a data structure that attaches 
to each voxel its effects on its neighbors. The scheme can be 
extended trivially to include longer range interactions for better 
convergence rates (in particular on larger voxel grids). However, 
the memory requirements to simply store A* ultimately scales 
like where n is the total number of voxels. The storage re- 
quirements can be reduced by, e.g., using A*'s of different widths 
for different voxels. Storage requirements are not so much a 
problem if a domain decomposition parallelization method is 
used and enough processors are available. 

We describe here the general procedure of calculating the 
A* with arbitrary bandwidth, up to the full A-operator, for the 
method in spherical symmetry (IHauschildt et al. 1994]) . The con- 
struction of the A* is described in ( |01son & Kunasz||1987 l), so 
that we here summarize the relevant formulae. In the method of 
( [Olson & Kunasz 1987 ), the elements of the row of A* are com- 
puted by setting the incident intensities (boundary conditions) 
to zero and setting S(i_t,iy,i,) - 1 for one voxel {i^,iy,i^) and 
performing a formal solution analytically. 

We describe the construction of A* using 
the example of a single characteristic. The con- 
tributions to the A* at a voxel j are given by 



therefore, the invariant quantity should be We immediately 
have 



A,j = 


for / < 


j 


- 1 


(26) 


Vij = .f(sj-i)7j-i 


for / = 


J 


- 1 


(27) 


^JJ = A,-ijexp(-AT,_i) + f(sj)/3) 


for / 






(28) 


Aj+ij = Ajjexp(-ATj) + f{sj+i)aj+i 


for / = 


J 


+ 1 


(29) 


A, J = A,-_ijexp(-AT,-_i) 


for j + 


1 


< / 


(30) 



These contributions are computed along a characteristic, here / 
labels the voxels along the characteristic under consideration. 
These contributions are integrated over solid angle with the 
same method (either deterministic or through the Monte-Carlo 
integration) that is used for the computation of the J. For a 
nearest neighbor A*, the process of Eq. [30] is stopped with 
; = 7 +1, otherwise it is continued until the required bandwidth 
has been reached (or the characteristic has reached an outermost 
voxel and terminates). Comparing with the results of Paper I, 
the A* operator is altered simply by the Doppler-shift factor f(s) 
at the appropriate point. 

4. From Co-moving Frame to Global Inertial Frame 

The specific intensity I\ is observer dependent, it is related to the 
observer invariant phase space distribution F{x, p) by 

h^--(u-pfF{x,p), (31) 
h 



A 



(32) 



We do not need to transform the direction vector, because when 
we write down our transfer equation, the only co-moving quan- 
tity we used is the co-moving wavelength, the other two mo- 
mentum space variables (e.g., nj,n,,)are global inertial (for the 
case we are working on, the n vector is the direction of photon 
in physical space, not the direction seen by the co-moving ob- 
server). For our flat spacetime case, if we want the direction of 
photon seen by u" - y[\,l3], we simply need to do a Lorentz 
boost, for example using equation 1 1.98 of Jackson ( 1975 1. 



5. Application examples 

As a first step we have built upon the MPI parallelized Fortran 95 
program described in Papers I- VI. The parallelization of the for- 
mal solution is presently implemented over solid angle space as 
this is the simplest parallelization option and also one of the most 
efficient (a domain decomposition parallelization method will be 
discussed in a subsequent paper). In addition, the Jordan solver 
of the Operator splitting equations is parallelized with MPI. The 
number of parallelization related statements in the code is small. 

Our basic continuum scattering test problem is similar to that 
discussed in |Hauschildt| ( |1992| ), pauschildt & Baron| ( |2004a) and 
Papers TIL This test problem covers a large dynamic range of 
about 9 dex in the opacities and overall optical depth steps along 
the characteristics and, in our experience, constitutes a reason- 
ably challenging setup for the radiative transfer code. The ap- 
plication of the 3D code to 'real' problems is in preparation and 
requires a substantial amount of development work (in progress). 
Comparing this test case to real world problems in ID we have 
found that this test is close to the worst case scenario and that 
convergence, etc is generally better in real world problems. We 
use a sphere with a grey continuum opacity parametrized by a 
power law in the continuum optical depth Tstd- The basic model 
parameters are 

1. Inner radius R^^ - lO'-'cm, outer radius Rant - 1.01 x 
10'^ cm. 

2. Minimum optical depth in the continuum t™^" = 10 and 
maximum optical depth in the continuum r™" = 10^. 

3. Grey temperature structure with T^odeX - 10"^ K. 

4. Outer boundary condition 7^^ = and diffusion inner bound- 
ary condition for all wavelengths. 

5. Continuum extinction - Clr^, with the constant C fixed 
by the radius and optical depth grids. 

6. Parametrized coherent & isotropic continuum scattering by 
defining 



Xc 



EcKc + (1 - eo)cr,. 



(33) 



with < ^ ^- Kc ™d (Tc are the continuum absorption and 
scattering coefficients. 

The test model is just an optically thick sphere put into the 
3D grid. This problem is used because the results can be di- 
rectly compared with the results obtained with our ID spherical 
radiation transport code ( |Hauschiidt|| 1 992| to assess the accu- 
racy of the method. The sphere is centered at the center of the 
Cartesian grid, which is in each axis 10% larger than the radius 
of the sphere. The solid angle space was discretized in (6, (p) with 
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ng - if not stated otherwise. In the following we discuss the 
results of various tests. In all tests we use the full characteristics 
method for the 3D RT solution. 

5.1. LIE tests 

In this test we have set e = 1 to test the accuracy of the for- 
mal solution by comparing to the results of the ID code. The ID 
solver uses 64 radial points, distributed logarithmically in op- 
tical depth. For the 3D solver we tested 'moderate' grids with 
Hr = = 2 * 32 -H 1 and rig = 2 * 16 -i- 1 points along each 
axis, for a total of 65^ * 33 ~ 1.4 x 10^ voxels. The momen- 
tum space discretization uses in general, ng - - 256 points. 
In Fig. [T] we show the mean intensities as a function of distance 
from the center for both the ID (-H symbols) and the 3D solver 
For the 3D results J is plotted at every voxel on the surface and 
the spherical symmetry is reproduced very well at every point 
on the surface. The results show excellent agreement between 
the two solutions, thus the 3D RT formal solution is comparable 
in accuracy to the ID formal solution. Fig. [T] shows the results 
for four different maximum velocities, and clearly indicates that 
the 3D RT solution is just as accurate as the ID code. This is 
actually quite a nice demonstration since the affine method used 
in the 3D code is completely different from the full co-moving 
momentum space method used in the 1-D code, that is, the spe- 
cific intensity is solved for in different frames in the two codes. 
However, since J depends only on the co-moving frequency they 
can be directly compared in the same frame. 

As shown in Paper I, for the conditions used in these tests 
a larger number of solid angle points significantly improves the 
accuracy of the mean intensities. Our tests show that reasonable 
accuracy can be achieved with as few as 16^ momentum space 
points, but in these test calculations we have used more points 
in order to really compare the 3D results to the ID results. A 
full investigation of the number of angle points needed for 
realistic asymmetric calculations will be the subject of future 
work. 

The line of the simple 2-level model atom is parametrized by 
the ratio of the profile averaged line opacity xi to the continuum 
opacity Xc and the line thermalization parameter e/. For the test 
cases presented below, we have used e^. - 1 and a constant tem- 
perature and thus a constant thermal part of the source function 
for simplicity (and to save computing time) and set;^'//^c - 10^ 
to simulate a strong line, with e; = (1.0,0.1) (see below). With 
this setup, the optical depths as seen in the line range from 10"^ 
to 10*'. We use 92 wavelength points to model the full line pro- 
file, including wavelengths outside the line for the continuum. 
We did not require the line to thermalize at the center of the test 
configurations, this is a typical situation one encounters in a full 
3D configurations as the location (or even existence) of the ther- 
malization depths becomes more ambiguous than in the ID case. 

The sphere is put at the center of the Cartesian grid, which is 
in each axis 10% larger than the radius of the sphere. For the test 
calculations we use voxel grids with the same number of spatial 
points in each direction (see below). The solid angle space was 
discretized in {9, <p) with ng - nip Unless otherwise stated, the 
tests were run on parallel computers using a variety of number 
of CPUs, architectures, and interconnects. 

5.2. LIE line and continuum test 

We first have set e; = 1 to test the accuracy of the formal solu- 
tion by comparing to the results of the ID code. The ID solver 




2.DXI0'' 4.0X10" 6.DXI0" 8.OXIO" I.DXIO" 1.2XI0'^ 



Fig. 1: The results of 1-D calculations are compared with 3-D 
calculations for /3 max = (0.03,0.33,0.67,0.87). The momentum 
space directions were discretized using ng - n,p - 256. 



10" 



io'° l 

990 995 1000 1005 1010 
>- (A) 

Fig. 2: The results of 1-D calculations for a scattering line are 
compared with 3-D calculations for y6„,„i = 0.03. The momen- 
tum space directions were discretized using ng - n^ = 512 
and the calculation was run on the Franklin Cray XT4 using 
2'^ = 16384 processors. 



uses 64 radial points, distributed logarithmically in optical depth. 
Comparing the line mean intensities / as function of distance 
from the center for both the ID and the 3D solver we again found 
excellent agreement. 

5.3. Tests witli line scattering 

We have run a number of test calculations similar to the UTE 
case but with line scattering included. In Fig.[2]we show the co- 
moving J I as a function of A for e/ = 0.1 with (3,„„^ = 0.03 and 
ne - n<l) - 512. The 3D calculations compare very well to the 
ID calculations. The small variation in some parts of the surface 
of the sphere (each black line represents a pixel on the surface of 
the sphere) is due to the different wavelength discretization. The 
3D case uses only the simple method described above, whereas 
the ID case uses the full Crank-Nicholson-like method described 
in |Hauschildt & Baron| ( (2004 al). 

Figure [3] shows the wall-clock time as a function of resolu- 
tion (number of momentum-space angles ng and n^ or number 
of CPUs). The computational work is kept constant with each 
CPU required to calculate 16 characteristics. The modest (14%) 
increase in wall-clock time from 16 CPUs to 16,384 CPUs is ac- 
ceptable given the huge increase in communication required and 
the fact that load balancing is quite simple. The wall-clock time 
for this test was about 625 s for each direction angle. The 
memory usage is controlled by the size of the spatial grid 
(this is also true in Papers I-IV). The only additional storage 
is that the values of the Intensity for both the previous and 
current wavelength point must be stored. The total memory 
per process was about 100 MB. 
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Fig. 3: The wall-clock time to solve a scattering line e = 0.1 on 
the Franklin Cray XT4 as a function of momentum frame angu- 
lar resolution. The test was run so the amount of computational 
work per processor was constant. The roughly 14% communica- 
tion time increase from 16 to 16,384 processors is acceptable. 



6. Wavelength Parallelization 

We have implemented and tested a "pipeline" wavelength par- 
allelization method using wavelength clusters in the manner de- 
scribed in Saron & Hauschildt'('1998 ). As in |Baron & Hauschildt] 
( |1998) , the parallelization over characteristics is done within a 
"wavelength cluster" and each worker thus must send only its 
values of the specific intensity to the corresponding process in 
the next wavelength cluster. For the simple test problems con- 
sidered so far the opacity calculation is trivial and hence, there is 
no speedup (but also no penalty) for this wavelength paralleliza- 
tion. However, in real world problems the time to calculate the 
opacities is roughly equal to the time required to solve the trans- 
fer equation and this leads to linear speedups in the ID case of 
up to about a factor of eight. 



7. Conclusions 



We have implemented the affine method described in Chen et al. 
( 2007 | l for the case of homologous flows and shown that it gives 
excellent results compared to the full co-moving method that we 
use in our ID code. We have also been able to parallelize it both 
over characteristic and wavelength. The characteristic scaling is 
excellent and immediately brings us into the forefront of mas- 
sively parallel computation. The next step is to include the 3D 
calculations in the full real world code and begin applying it to 
numerous astrophysical problems. 
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Appendix A: Determining s from Coordinates 

The characteristics are followed through the voxel grid from an 
entry boundary point to an exit boundary point. It is convenient 
to choose some particular voxel as a "starting point" ro and deter- 
mine the distance s to the two boundary points. We will choose 
our sign convention such that the distance to the entry point is 
negative and the distance to the exit point is positive. Given a 
starting point ro we can find the distance to a boundary point R 
(where R = Rm, or R = /?out) from Eq. Q and find 

+ 2(n ■ ro)s + (rl - R^) ^ (A. 1) 

The characteristics can be divided into three classes. Tangential 
characteristics (those that do not hit the inner boundary Ri„ have 

R - Rout, 

and satisfy the constraint that the impact parameter is greater 
than Rin, 



rl-(n-rof>Rl 



For this case, Eq. ( A. 1 1 has two solutions 



= -(n ■ ro) - ^(nTo)2-(r2-/?2), 



and 



= -(n ■ ro) + ^(n-ro)2-(r2-/?2), 



such that 

i- < < i+ 



(A.2) 



(A.3) 



(A.4) 



(A.5) 



Core-intersecting characteristics include two cases, incom- 
ing and outgoing rays. Incoming core-intersecting characteris- 
tics are determined by 

n ■ ro < 0, (A.6) 

where 

R - Rout, 

and there is only one solution 

^_ = -(n ■ ro) - ^(n-ro)2-(r2-/;2), (A.7) 
the other solution 

= -(n ■ ro) + ^(n-ro)2-(r2-7?2), (A.8) 
should be dropped because it passes through the core. Here 
i_ < < i+. (A.9) 
For outgoing core-intersecting characteristics 
n-ro>0. (A. 10) 
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this time the characteristic should start from the core 

R = Rin, 

For this case 

= -(n ■ r«) + ^(n-vo)^-(rl-R2), (A.ll) 
is the desired solution and the other solution 

= -(n • ro) - ^(n-ro)2-(r2-/?2) (A.12) 
passes through the core. In this case, 

0> s-> (A.13) 



